NSF PAR Search | NSF Public Access Repository


Adjudicating LLMs as PropBank Adjudicators

Bonn, Julia; Madabushi, Harish Tayyar; Hwang, Jena D; Bonial, Claire (May 2024, ELRA and ICCL)
Bonial, Claire; Bonn, Julia; Hwang, Jena D (Ed.)
We evaluate the ability of large language models (LLMs) to provide PropBank semantic role label annotations across different realizations of the same verbs in transitive, intransitive, and middle voice constructions. To assess the meta-linguistic capabilities of LLMs, as well as their ability to glean such capabilities through in-context learning, we evaluate the models in three settings: a zero-shot setting; a setting where they are given three examples of another verb used in transitive, intransitive, and middle voice constructions; and a setting where they are given the examples as well as the correct sense and roleset information. We find that zero-shot knowledge of PropBank annotation is almost nonexistent. The largest model evaluated, GPT-4, achieves the best performance in the setting where it is given both examples and the correct roleset in the prompt, demonstrating that larger models can ascertain some meta-linguistic capabilities through in-context learning. However, even in this setting, which is simpler than the task of a human in PropBank annotation, the model achieves only 48% accuracy in marking numbered arguments correctly. To ensure transparency and reproducibility, we publicly release our dataset and model responses.
Full Text Available
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements

https://doi.org/10.18653/v1/2023.findings-acl.392

Zhou, Xuhui; Zhu, Hao; Yerukola, Akhila; Davidson, Thomas; Hwang, Jena D.; Swayamdipta, Swabha; Sap, Maarten (January 2023, Findings of the Association for Computational Linguistics: ACL 2023)

Full Text Available
SOCIAL CHEMISTRY 101: Learning to Reason about Social and Moral Norms

Forbes, Maxwell; Hwang, Jena D; Shwartz, Vered; Sap, Maarten; Choi, Yejin (October 2020, EMNLP)

Full Text Available
On-the-Fly Attention Modulation for Neural Generation

https://doi.org/10.18653/v1/2021.findings-acl.107

Dong, Yue; Bhagavatula, Chandra; Lu, Ximing; Hwang, Jena D.; Bosselut, Antoine; Cheung, Jackie Chi; Choi, Yejin (January 2021, Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021)

Despite considerable advancements with deep neural language models (LMs), neural text generation still suffers from degeneration: the generated text is repetitive, generic, self-contradictory, and often lacks commonsense. Our analyses on sentence-level attention patterns in LMs reveal that neural degeneration may be associated with insufficient learning of task-specific characteristics by the attention mechanism. This finding motivates on-the-fly attention modulation, a simple but effective method that enables the injection of priors into attention computation during inference. Automatic and human evaluation results on three text generation benchmarks demonstrate that attention modulation helps LMs generate text with enhanced fluency, creativity, and commonsense reasoning, in addition to significantly reducing sentence-level repetition.
Full Text Available
Edited Media Understanding: Reasoning About Implications of Manipulated Images

Da, Jeff; Forbes, Maxwell; Zellers, Rowan; Zheng, Anthony; Hwang, Jena D.; Bosselut, Antoine; Choi, Yejin (December 2020, Association for Computational Linguistics)

Multimodal disinformation, from "deepfakes" to simple edits that deceive, is an important societal problem. Yet at the same time, the vast majority of media edits are harmless, such as a filtered vacation photo. The difference between this example and harmful edits that spread disinformation is one of intent. Recognizing and describing this intent is a major challenge for today's AI systems. We present the task of Edited Media Understanding, requiring models to answer open-ended questions that capture the intent and implications of an image edit. We introduce a dataset for our task, EMU, with 48k question-answer pairs written in rich natural language. We evaluate a wide variety of vision-and-language models for our task, and introduce a new model PELICAN, which builds upon recent progress in pretrained multimodal representations. Our model obtains promising results on our dataset, with humans rating its answers as accurate 40.35% of the time. At the same time, there is still much work to be done (humans prefer human-annotated captions 93.56% of the time), and we provide analysis that highlights areas for further progress.
Full Text Available
K-SNACS: Annotating Korean Adposition Semantics

Hwang, Jena D.; Choe, Hanwool; Han, Na-Rae; Schneider, Nathan (January 2020, Proceedings of the Second International Workshop on Designing Meaning Representations)

While many languages use adpositions to encode semantic relationships between content words in a sentence (e.g., agentivity or temporality), the details of how adpositions work vary widely across languages with respect to both form and meaning. In this paper, we empirically adapt the SNACS framework (Schneider et al., 2018) to Korean, a language that is typologically distant from English, the language SNACS was based on. We apply the SNACS framework to annotate the highly popular novella The Little Prince with semantic supersense labels over all Korean postpositions. Thus, we introduce the first broad-coverage corpus annotated with Korean postposition semantics and provide a detailed analysis of the corpus with an apples-to-apples comparison between Korean and English annotations.
Full Text Available
Preparing SNACS for Subjects and Objects

https://doi.org/10.18653/v1/W19-3316

Shalev, Adi; Hwang, Jena D.; Schneider, Nathan; Srikumar, Vivek; Abend, Omri; Rappoport, Ari (January 2019, Proceedings of the First International Workshop on Designing Meaning Representations)

Full Text Available
